Recovery property




Sparse Recovery with Brownian Sensing

Neural Information Processing Systems

We introduce an additional randomization process, called Brownian sensing, based on the computation of stochastic integrals. It produces a Gaussian sensing matrix for which good recovery properties are proven independently of the number of sampling points N, even when the features are arbitrarily non-orthogonal.
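
As a rough illustration of the construction described above, here is a minimal NumPy sketch of the Brownian sensing idea for a one-dimensional domain: both the features and the noisy observations are integrated against M independent Brownian motions, yielding a small linear system whose measurement matrix has Gaussian entries. The one-dimensional domain, the left-point quadrature rule, and all function names are assumptions of this sketch, not the paper's exact procedure.

```python
import numpy as np

def brownian_sensing(phis, x, y, M, rng=None):
    """Sketch: turn N noisy point evaluations y_i ~ f(x_i) of
    f = sum_j alpha_j * phi_j into M linear measurements with a
    Gaussian sensing matrix, by integrating features and observations
    against M independent Brownian motions on a sorted grid x in [0, 1].
    `phis` is a list of K feature functions (assumed names, for illustration)."""
    rng = np.random.default_rng(rng)
    N, K = len(x), len(phis)
    # Brownian increments over the N - 1 grid intervals, for M motions.
    dB = rng.normal(scale=np.sqrt(np.diff(x)), size=(M, N - 1))
    # A[m, j] = integral of phi_j against B_m, approximated by a
    # left-point sum; each entry is a Gaussian random variable.
    Phi = np.array([[phi(xi) for phi in phis] for xi in x[:-1]])  # (N-1, K)
    A = dB @ Phi
    # b[m] = integral of f against B_m, estimated from the noisy samples y_i.
    b = dB @ y[:-1]
    # Recover a sparse alpha from b ~ A @ alpha with any L1 solver.
    return A, b
```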




Robust Regression via Hard Thresholding

Bhatia, Kush, Jain, Prateek, Kar, Purushottam

Neural Information Processing Systems

We study the problem of Robust Least Squares Regression (RLSR), where several response variables can be adversarially corrupted. More specifically, for a data matrix $X \in \mathbb{R}^{p \times n}$ and an underlying model $w^*$, the response vector is generated as $y = X^\top w^* + b$, where $b \in \mathbb{R}^n$ is the corruption vector supported over at most $C \cdot n$ coordinates. Existing exact recovery results for RLSR focus solely on $L_1$-penalty based convex formulations and impose relatively strict model assumptions, such as requiring the corruptions $b$ to be selected independently of $X$. In this work, we study a simple hard-thresholding algorithm called TORRENT which, under mild conditions on $X$, can recover $w^*$ exactly even if $b$ corrupts the response variables in an adversarial manner, i.e. both the support and the entries of $b$ are selected adversarially after observing $X$ and $w^*$. Our results hold under deterministic assumptions which are satisfied if $X$ is sampled from any sub-Gaussian distribution. Moreover, unlike existing results that apply only to a fixed $w^*$ generated independently of $X$, our results are universal and hold for any $w^* \in \mathbb{R}^p$. Next, we propose gradient descent-based extensions of TORRENT that can scale efficiently to large problems, such as high-dimensional sparse recovery, and prove similar recovery guarantees for these extensions. Empirically, we find that TORRENT, and even more so its extensions, offer significantly faster recovery than state-of-the-art $L_1$ solvers. For instance, even on moderately sized datasets (with $p$ = 50K) and around 40% corrupted responses, a variant of our method called TORRENT-HYB is more than 20x faster than the best $L_1$ solver.
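
The fully corrective variant suggested by the abstract alternates between a least-squares fit on the points currently believed clean and a hard-thresholding step on the residuals. A minimal NumPy sketch of that loop follows; it uses the rows-as-samples (n x p) convention rather than the abstract's p x n one, and the function name, iteration count, and stopping rule are illustrative assumptions, not the paper's reference implementation.

```python
import numpy as np

def torrent_fc(X, y, beta, n_iters=50):
    """Fully-corrective hard-thresholding loop in the spirit of TORRENT.
    beta is an upper bound on the fraction of corrupted responses;
    X is n x p (one row per sample)."""
    n, p = X.shape
    keep = n - int(np.ceil(beta * n))  # number of points treated as clean
    S = np.arange(n)                   # start with all points active
    w = np.zeros(p)
    for _ in range(n_iters):
        # Fully corrective step: exact least squares on the active set.
        w, *_ = np.linalg.lstsq(X[S], y[S], rcond=None)
        # Hard-threshold the residuals: keep the points that fit best.
        r = np.abs(y - X @ w)
        S = np.argsort(r)[:keep]
    return w
```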


On Iterative Hard Thresholding Methods for High-dimensional M-Estimation

Jain, Prateek, Tewari, Ambuj, Kar, Purushottam

Neural Information Processing Systems

The use of M-estimators in generalized linear regression models in high dimensional settings requires risk minimization with hard $L_0$ constraints. Of the known methods, the class of projected gradient descent (also known as iterative hard thresholding (IHT)) methods is known to offer the fastest and most scalable solutions. However, the current state-of-the-art is only able to analyze these methods in extremely restrictive settings which do not hold in high dimensional statistical models. In this work we bridge this gap by providing the first analysis for IHT-style methods in the high dimensional statistical setting. Our bounds are tight and match known minimax lower bounds. Our results rely on a general analysis framework that enables us to analyze several popular hard thresholding style algorithms (such as HTP, CoSaMP, SP) in the high dimensional regression setting. Finally, we extend our analysis to the problem of low-rank matrix recovery.
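
For concreteness, a minimal sketch of plain IHT for the sparse least-squares case follows: a gradient step followed by projection onto the set of k-sparse vectors, i.e. keeping the k largest-magnitude coordinates. The step size, iteration count, and function names here are illustrative assumptions; the paper's guarantees concern the high-dimensional statistical setting rather than this textbook variant.

```python
import numpy as np

def hard_threshold(w, k):
    """Keep the k largest-magnitude entries of w, zero out the rest."""
    out = np.zeros_like(w)
    idx = np.argsort(np.abs(w))[-k:]
    out[idx] = w[idx]
    return out

def iht_least_squares(X, y, k, step=None, n_iters=200):
    """IHT for f(w) = 0.5 * ||y - X w||^2 subject to ||w||_0 <= k."""
    n, p = X.shape
    if step is None:
        # Conservative step size: 1 / largest eigenvalue of X^T X.
        step = 1.0 / np.linalg.norm(X, 2) ** 2
    w = np.zeros(p)
    for _ in range(n_iters):
        grad = X.T @ (X @ w - y)          # gradient of the squared loss
        w = hard_threshold(w - step * grad, k)
    return w
```

On a synthetic k-sparse regression instance, this loop typically recovers the true support once the sample size is a modest multiple of k log p, in line with the kind of statistical regime the paper analyzes.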


On Iterative Hard Thresholding Methods for High-dimensional M-Estimation

Jain, Prateek, Tewari, Ambuj, Kar, Purushottam

arXiv.org Machine Learning

The use of M-estimators in generalized linear regression models in high dimensional settings requires risk minimization with hard $L_0$ constraints. Of the known methods, the class of projected gradient descent (also known as iterative hard thresholding (IHT)) methods is known to offer the fastest and most scalable solutions. However, the current state-of-the-art is only able to analyze these methods in extremely restrictive settings which do not hold in high dimensional statistical models. In this work we bridge this gap by providing the first analysis for IHT-style methods in the high dimensional statistical setting. Our bounds are tight and match known minimax lower bounds. Our results rely on a general analysis framework that enables us to analyze several popular hard thresholding style algorithms (such as HTP, CoSaMP, SP) in the high dimensional regression setting. We also extend our analysis to a large family of "fully corrective methods" that includes two-stage and partial hard-thresholding algorithms. We show that our results hold for the problem of sparse regression, as well as low-rank matrix recovery.


On pattern recovery of the fused Lasso

Qian, Junyang, Jia, Jinzhu

arXiv.org Machine Learning

We study the properties of the Fused Lasso Signal Approximator (FLSA) for estimating a blocky signal sequence with additive noise. We transform the FLSA into an ordinary Lasso problem. By studying the design matrix of the transformed Lasso problem, we find that the irrepresentable condition might not hold, in which case we show that the FLSA might not be able to recover the signal pattern. We then apply the recently developed preconditioning method, the Puffer Transformation [Jia and Rohe, 2012], to the transformed Lasso problem. We call the new method the preconditioned fused Lasso and give non-asymptotic results for it. These results show that when the signal jump strength (the signal difference between two neighboring groups) is large and the noise level is small, our preconditioned fused Lasso estimator recovers the correct pattern with high probability. The theory also gives insight into what controls the pattern recovery ability: it is the noise level, not the length of the sequence. Simulations confirm our theorems and show a significant improvement of the preconditioned fused Lasso estimator over the vanilla FLSA.
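
A minimal sketch of the pipeline the abstract describes: rewrite the FLSA as an ordinary Lasso in the jump variables, then precondition with the Puffer transformation before solving. The sketch assumes the standard lower-triangular all-ones reparametrization and, for brevity, penalizes the first (level) coordinate as well; scikit-learn's Lasso is used only as a convenient off-the-shelf solver, with its 1/(2n) loss scaling absorbed into alpha.

```python
import numpy as np
from sklearn.linear_model import Lasso

def preconditioned_fused_lasso(y, lam):
    """Sketch: solve min_b 0.5*||y - b||^2 + lam * sum_i |b_{i+1} - b_i|
    by writing b = L @ theta (L lower-triangular all-ones, theta = jumps),
    applying the Puffer transformation F = U D^{-1} U^T from the SVD
    L = U D V^T, and running an ordinary Lasso on the preconditioned data."""
    n = len(y)
    L = np.tril(np.ones((n, n)))       # cumulative-sum design: b = L @ theta
    U, d, Vt = np.linalg.svd(L)
    F = U @ np.diag(1.0 / d) @ U.T     # Puffer preconditioner
    A, b = F @ L, F @ y                # note F @ L = U @ Vt has orthonormal columns
    # sklearn minimizes (1/(2n))*||b - A t||^2 + alpha*||t||_1, hence lam / n.
    theta = Lasso(alpha=lam / n, fit_intercept=False).fit(A, b).coef_
    return L @ theta                   # back to the blocky signal estimate
```

The preconditioned design F @ L = U @ Vt has orthonormal columns, which is exactly why the irrepresentable-condition failure of the raw transformed design no longer obstructs pattern recovery in this sketch.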